SUBJ: uPRISM - The Final Chapter
EXECUTIVE SUMMARY 

During Q1, uPRISM CPU chips were fabricated and evaluated. Two design
bugs were uncovered, but they did not seriously impair operation or 
evaluation. Parametric evaluation shows that the device exceeds the 
design target by at least 20%, allowing for potential production of 50MHz 
and 70MHz versions. As such, uPRISM is the fastest microprocessor extant 
in a commercial technology. Given the current directions in engineering, 
this technological advantage will not be realized in a DEC product. 

BACKGROUND 

When the decision to cancel the overall PRISM program was made, the 
first-pass uPRISM CPU chip was in the Hudson process line. SEG management 
agreed to allow the uPRISM team to continue with the fabrication and debug 
of the CPU chip during Q1 as a wrap-up for the project. This memo
documents the debug results and represents our final project report as a 
team. 



DEBUG and EVALUATION RESULTS 

Although the uPRISM die size is large (9.5mm X 13.5mm) and complex (294K 
xtors), initial yield has been quite good. Processing was completed on 
only 6 wafers and from the best one of those wafers we found four fully 
functional die. Functional testing was reasonably comprehensive except 
for exhaustive testing of the on-chip caches. Two functional bugs were 
uncovered. The first bug was a shorted bit on one of the internal busses 
due to a DRC oversight. The second bug involved the control logic for 
writing the register file from one of the two internal write busses. The
source of the problem has been narrowed to a particular shift register,
but the exact cause has not been determined. Neither of these bugs 
precludes testing the major functional units or in fact precludes running 
programs in the chip. 

Parametric testing was also completed on the available die. These wafers 
showed fundamental electrical parameters which placed them on the slow 
side of CMOS-2 process parameter distribution, somewhat faster than worst 
case but slower than typical. The current version of the AdvanTest 3381
tester is limited to frequencies not much greater than 100MHz. Since
uPRISM requires a 2X clock, there was difficulty in trying to exercise the
chips above 50MHz. However, it was possible to coax the tester up to
[illegible] device operation. We were able to raise the junction temperature
to 125C before experiencing problems with clock jitter which we believe
was due to the poor clock signals being received by the chip (i.e., not a
fundamental speed limitation in the device). Nevertheless, extrapolation
from these particular devices to the overall process range implies that
the design is at least capable of supporting a slow bin of 50MHz (20nS
cycle) and a fast bin of 70MHz (14nS cycle) under worst-case operating
conditions.
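
The clock arithmetic in the paragraph above can be checked with a short sketch. It assumes only that cycle time is the reciprocal of the clock rate and that a 2X clock means the tester must drive the chip at twice its operating frequency. The function names are hypothetical and are not part of the original evaluation.

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Return the cycle time in nanoseconds for a clock rate given in MHz."""
    return 1000.0 / clock_mhz


def max_chip_mhz(tester_mhz: float, clock_multiple: int = 2) -> float:
    """Return the highest chip operating rate a tester can drive when the
    chip needs an input clock at clock_multiple times that rate."""
    return tester_mhz / clock_multiple


if __name__ == "__main__":
    # The two speed bins quoted in the memo.
    for mhz in (50, 70):
        print(f"{mhz} MHz -> {cycle_time_ns(mhz):.1f} ns cycle")
    # A tester limited to about 100 MHz, driving a chip with a 2X clock.
    print(f"100 MHz tester -> {max_chip_mhz(100):.0f} MHz chip rate")
```

Under these assumptions, a 100MHz tester limit corresponds to roughly 50MHz of chip operation, which is consistent with the difficulty reported in testing above that rate.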

ANALYSIS 

Predicting uPRISM performance from the clock rate data is somewhat 
difficult at this time due to the fact that a quality compiler is not 
available. Because the uPRISM implementation is heavily pipelined, real 
performance depends more on sophisticated compilation than standard RISC.
Estimating from the code generated by existing compilers with some
future improvements nets an effective TPI of between 1.7
and 2.0 for integer operations. Because floating point latencies are much
higher than integer, the effective TPI for heavy floating point rises
to 4. With high content floating point code, performance is limited by
the CPU-FPU-CACHE bus bandwidth. Note that floating point performance is
predicated on the uPRISM FPU chip (a modification of the RIGEL FPU) the 
design of which was halted when the PRISM program was canceled. Also, 
application of the uPRISM CPU at speeds higher than 40MHz requires a 
custom cache interface chip, a project canceled when Emerald (uPRISM XMI 
system) was canceled. 
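
The relationship between clock rate, TPI, and average MIPS discussed above can be sketched as follows. This is a minimal illustration, assuming that TPI denotes clock ticks per instruction and that average MIPS is simply the clock rate divided by TPI. The function name avg_mips is hypothetical.

```python
def avg_mips(clock_mhz: float, tpi: float) -> float:
    """Estimate average MIPS as clock rate (MHz) divided by ticks per
    instruction. Assumes TPI means clock ticks per instruction."""
    return clock_mhz / tpi


if __name__ == "__main__":
    clock_mhz = 70.0  # fast bin quoted in the memo
    # Integer TPI range (1.7 to 2.0) and heavy floating point TPI (4).
    for label, tpi in (("integer, TPI 2.0", 2.0),
                       ("integer, TPI 1.7", 1.7),
                       ("heavy floating point, TPI 4", 4.0)):
        print(f"{label}: {avg_mips(clock_mhz, tpi):.1f} MIPS")
```

With a 70MHz clock, the integer TPI range gives roughly 35 to 41 MIPS under this assumption, close to the 35-42 average MIPS range quoted for uPRISM.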

The following table compares uPRISM with contemporary CMOS RISC MPU's.
When multiple speed versions are available, the table assumes the fastest
version announced (some of which are not yet actually available).



CHIP              CLOCK    PEAK      AVG       LINPACK
                  (MHz)    MIPS      MIPS      MFLOPS

UPRISM            70       70        35-42     7 (est)
R2000             16       16        10        2.1
R3000             25       25        20        3.8
88000             20       20        15        -
SPARC (FUJI GA)   -        -         7         -
SPARC (CYPRUS)    33       33        20 (est)  2.5 (est)
INTEL N10         33       50 (est)  33 (est)  10 (claimed)

Note:  First pass N10's are due out this fall - samples at
       any speed not yet available. INTEL has a history of
       not meeting initial speed targets at first silicon.

From the chart, it is clear that the only major performance competitor to
uPRISM in the current time frame is the N10, which will be the first MPU
with multiple issue capability. In addition, recent data from MIPSCO
indicates that the R6000 ECL MPU is being retargeted at 17-18nS. As such,
it is not competitive with uPRISM or the N10 (same performance, much
higher cost).

FUTURES 

If there were a future for uPRISM, what would it be? In our judgment it's 
too late to implement a system product with this (CMOS-2) implementation
due to the status of FPU and C chips and software. However, this design 
could be migrated to CMOS-3 and the CPU, FPU and C-chip functionality 
merged. This would dramatically reduce floating point and cache miss 
latency and make the chip easier to use in a system. Power dissipation 
would also be reduced to more manageable levels (<10 watts). With the
right resources, this migration could be accomplished in 12 months design 
time (15 months to first samples). 

Projected performance levels 15 months from now are:

CHIP              CLOCK    PEAK      AVG       LINPACK
                  (MHz)    MIPS      MIPS      MFLOPS

UPRISM-3          100      100       60        20
R3000             33       33        26        5
R4000             50       75        50        10
88000             40       40        30        -
SPARC (CYPRUS)    50       50        33        4
INTEL N10         50       75        50        15



From the chart it is clear that the N10 and the R4000 are the only close
competition for a CMOS-3 uPRISM. Both feature multiple issue 
implementations. Since the R4000 is still under design and is a major 
departure from the previous generation R3000, its performance and 
availability must be considered somewhat skeptically. 
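
The comparison drawn from the projected table can be reproduced with a short sketch. The figures are copied from the table above. The helper names are hypothetical, and ranking by average MIPS alone is an assumption made for illustration, not necessarily the only basis of the judgment in the text.

```python
# Projected figures copied from the table above:
# (clock MHz, peak MIPS, avg MIPS, LINPACK MFLOPS); None marks a "-" entry.
PROJECTED = {
    "UPRISM-3": (100, 100, 60, 20),
    "R3000": (33, 33, 26, 5),
    "R4000": (50, 75, 50, 10),
    "88000": (40, 40, 30, None),
    "SPARC (CYPRUS)": (50, 50, 33, 4),
    "INTEL N10": (50, 75, 50, 15),
}


def avg_mips_ratio(table, reference):
    """Map every other chip to its average MIPS as a fraction of the
    reference chip's average MIPS."""
    ref = table[reference][2]
    return {name: row[2] / ref
            for name, row in table.items() if name != reference}


def closest_competitors(table, reference, count=2):
    """Return the chips whose average MIPS comes closest to the reference,
    highest first (illustrative ranking on average MIPS only)."""
    ratios = avg_mips_ratio(table, reference)
    return sorted(ratios, key=ratios.get, reverse=True)[:count]


if __name__ == "__main__":
    for name, ratio in avg_mips_ratio(PROJECTED, "UPRISM-3").items():
        print(f"{name}: {ratio:.0%} of UPRISM-3 average MIPS")
    print("Closest:", closest_competitors(PROJECTED, "UPRISM-3"))
```

On these numbers the R4000 and the N10 each reach about 83% of the projected UPRISM-3 average MIPS, while the remaining parts fall at or below 55%.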

To regain a major performance lead on the competition will require fast 
(>100MHz) clocks and multiple issue. Multiple issue requires major 
changes to the uPRISM implementation - much more difficult than merging 
the CMOS-2 chips into a CMOS-3 implementation. Therefore, unless there is 
a strong reason to deliver a DEC designed RISC product in 18 months, and 
that certainly doesn't seem to be the case, we will drop back and consider 
targeting the 1991 time frame. In that way we can address the cosmic 
issues including - 

o multiple issue, 
o extended addressing,
o RISCY VAX extensions,
o competitive sourcing,
o ease of application,
o the appropriate ISP and architecture.

CONCLUSION

The evaluation results show that we created a world class design. This
was accomplished with limited resources under difficult circumstances
including several redirections and two cancellations. As engineers we are
proud of our accomplishment. As stockholders and corporate citizens, the
uPRISM team is very disappointed that this work will not be used to
improve DEC's competitive position in the marketplace.

I would like to thank everyone who worked directly on the project (names 
below) and those who supported us over the last three years in our various 
incarnations including HR-32. I would especially like to acknowledge the
work of our test engineer, Greg Papadeas, and the support of his manager, 
Suresh Nadig, for pursuing the difficult task of high speed evaluation of 
the chip with vigor even after it was clear that the program had no direct 